library(tidyverse)
library(readr)
library(ggplot2)
library(gt)
library(knitr)
Waste in Swiss Forests - Report
Abstract
Waste in Swiss forests is a concern for wildlife and the environment, and it also spoils the perceived beauty of nature. Litter such as plastic bottles, cans or food packaging is often left behind by people passing through the forest and harms the ecosystem. This project aims to portray the state of waste in Zurich's forests at Hoenggerberg and Kaeferberg in June 2025, and to find solutions to the problem. To this end, people in Zurich were asked how they perceive the state of waste in these two forests. The survey investigated where they saw which types of waste and which activities were likely causing it. A special focus was placed on the impact of sports in the forest. Finally, participants selected possible measures to remove waste from the forest.
It was found that most litter in the forest ends up around picnic areas and benches, in the form of consumer goods such as plastic bottles, cans and cigarette butts. While participants generally have a positive view of the waste situation in the forest (4 out of 5), they believe that activities such as barbecues, picnics or camping cause the most waste. Sports activities were not perceived to have an additional impact on waste.
For measures against waste, participants prefer actions tied to personal responsibility first: removal of trash by those who left it, supported by more bins in the forest. Fines for those who litter should further enforce this direct responsibility. Clean-up of waste by authorities or volunteers received the second-most votes.
Methods
Data gathering
For this project, a total of 27 participants were asked about their assessment of waste in forests around ETH Hoenggerberg. 12 people were approached in the forest on the evening of Sunday, 1st of June 2025. They were asked to fill out a Google survey about how they perceive waste in that forest in general, and specifically on that day. The following screenshot displays the briefing that the participants received before starting the survey. The photo on the right shows me in the process of looking for participants in the forest.
Site plan
The following satellite view shows the site of the survey, which was conducted in two forests in the north of Zurich: the Hoenggerberg forest and the Kaeferberg forest.
Both forests lie near the campus of ETH Hoenggerberg. The student housing on the campus is marked with a red square and is situated close to the Hoenggerberg forest. In fact, many participants of this survey live in these apartment blocks.
The QR-codes indicate where flyers with a link to the online survey were hung up. The list icon with a green frame indicates the two positions near the forest entrances where participants were asked to take the survey in person.
The survey consists of 19 questions, separated into 5 sections:
Waste in the forest (6 questions, 2 optional)
Forest activities (4 questions)
Sports in the forest (4 questions)
Measures against waste in the forest (3 questions)
A section about the user’s demographic information (2 questions: age, gender)
A link to the survey can be found here:
German survey
A second survey was created in German in order to reach more participants. However, maintaining two surveys simultaneously was later deemed too complicated. As a result, only two participants (an elderly couple) answered the German survey on the first day, and all following participants were shown the English survey instead.
A link to the German survey can be found here:
https://forms.gle/gXJUSYirHmnU8XbF7
The answers of the 2 German-speaking participants were manually entered into the English version of the Google Form on Thursday, 6th of June, by the study author. Thus, all analysis could be done based on the results from the English survey.
On Monday, 2nd of June 2025, the English survey was then also sent to residents in student housing at ETH Hoenggerberg via social media. Further, flyers with a QR code were hung up in the student housing complex. More flyers were also hung up at different picnic-area spots in the Hoenggerberg and Kaeferberg forests.
To ensure validity, participants who filled out the survey in this manner were reminded to only fill it out if they had been in the forest on that day. But since they were not met in the forest in person, unlike the first batch of participants, it cannot be verified whether they actually spent time in the forest on the day they filled out the survey.
Data processing
The survey was then processed with R scripts in RStudio. The first R script “01-data-download.R” extracts the raw data from the Google Sheet that comes along with the survey. The second R script “02-data-cleaning” then processes this data into a tidy data frame with exactly one value per cell. Finally, a Quarto document “index.qmd” is used to write this report within RStudio.
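To illustrate what "one value per cell" means for survey data, here is a minimal sketch. It does not reproduce the actual cleaning script: the column names `id` and `waste_types` and the rows are made up, and it assumes that multiple-choice answers arrive as one comma-separated string per participant.

```r
library(dplyr)
library(tidyr)

# Hypothetical raw export: one row per participant, with all selected
# waste types packed into a single comma-separated cell.
raw <- tibble(
  id = 1:3,
  waste_types = c("plastic bottles, cans",
                  "cigarette butts",
                  "cans, food packaging")
)

# Split the packed cell so that every row holds exactly one waste type.
tidy <- raw |>
  separate_rows(waste_types, sep = ",\\s*")

tidy
```

In this long form, counting how often each waste type was named becomes a simple `count(waste_types)`.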
The R-packages used for this process were tidyverse, readr, ggplot2, gt, knitr, googlesheets4, dplyr and lubridate. During development, Git version control was applied and the progress was stored on a GitHub repository.
The final report was then published via GitHub Pages. The R code and the generated visualizations, with their interpretation, are listed below.
R-packages
Data import
The processed data is imported from the data/processed directory which was created with the second R-script.
survey1 <- read_rds(here::here("data/processed/survey1-processed.rds"))
survey2 <- read_rds(here::here("data/processed/survey2-processed.rds"))
survey_small <- read_rds(here::here("data/processed/survey-small-processed.rds"))
Data visualization
In this section, visualization tools learned in the RBTL course are utilized to create different tables and diagrams to highlight certain trends in the data.
Demography and timeline
The following point plot (Figure 1) displays when the survey was taken and how the users rated the waste situation in the forest around Hoenggerberg. The rating ranged from 1 to 5, with 5 being the best (highest cleanliness). It is likely that the given rating reflects the tidiness of the forest at that time. Most people rated the cleanliness of the Hoenggerberg and Kaeferberg forests between 3 and 4, so there is room for improvement. However, the lowest rating of 1 was not given a single time. No trends are visible regarding the gender of the participants.
ggplot(data = survey_small,
mapping = aes(x = timestamp,
y = waste_rating,
color = gender,
shape = gender
)
) +
geom_point(size = 3) +
labs(x = "survey timestamp", y = "perceived forest cleanliness") +
scale_shape(solid = FALSE) +
ylim(1, NA)
# the tip to include scale_shape(solid = FALSE) came from stackoverflow,
# https://stackoverflow.com/a/51775750
The figure is a bit hard to read, because most people filled out the survey at around the same time. To make it clearer, Table 1 presents the summarized data in a more understandable form:
survey_weekday <- survey_small |>
group_by(weekday, gender) |>
summarize(amount = n())
survey_weekday |>
ungroup() |>
gt()
write_csv(survey_weekday, here::here("data/final/tbl-weekday-gender.csv"))
| weekday | gender | amount |
|---|---|---|
| sunday | female | 2 |
| sunday | male | 11 |
| monday | female | 3 |
| monday | male | 6 |
| tuesday | female | 1 |
| wednesday | female | 2 |
| thursday | female | 1 |
| thursday | male | 1 |
All participants were either male or female. The survey also offered the options “non-binary” and “prefer not to answer” for the gender, but these were never selected.
The second figure (Figure 2) also displays the number of answers per day, with a clearer distinction between male and female responses. 12 people were met in person in the forest on Sunday evening, most of them male. One additional person answered the survey online later that day. From Monday to Thursday, responses came only from people answering the questionnaire online. Most of these answers came in on Monday, when the link to the survey was posted in a WhatsApp group chat for students living in student housing on Campus Hoenggerberg, which is only 200 meters away from the Kaeferberg forest. Most of these participants were also male.
# same dataframe as in the code block above, so I don't use write_csv
survey_weekday <- survey_small |>
group_by(weekday, gender) |>
summarize(amount = n())
# adding the white labelling in this plot was done with the help of ChatGPT, prompt:
# https://chatgpt.com/share/6841b0b5-c2e4-8011-b70d-c15aacb89456
ggplot(data = survey_weekday,
mapping = aes(x = weekday,
y = amount,
fill = gender)) +
geom_col(position = "stack") +
geom_text(aes(label = amount),
position = position_stack(vjust = 0.5),
color = "white") +
labs(title = "Number of answers per day",
x = "weekday the survey was taken",
y = "number of answers")
The following histogram (Figure 3) highlights the demography of the participants:
survey_demography <- survey_small
write_csv(survey_demography, here::here("data/final/fig-demography.csv"))
ggplot(data = survey_demography,
mapping = aes(x = age, fill = gender)) +
geom_histogram(color = "black") +
labs(title = "age of participants in years",
subtitle = "27 participants total",
x = "age in years",
y = "count") +
theme_minimal()
Most participants were between 20 and 40 years old, since many of them came from the student housing apartments near the forest entrance.
Statistical summary
The following tables and diagrams provide a statistical summary of the numerical values collected from the survey.
In the first question of the survey, people were shown a picture of a plastic cup on the forest ground. They were asked to rate their feeling when seeing trash in the forest on a scale from 1 to 5 (1 = very upset, 5 = very happy).
Figure 4 shows that all participants reported a bad feeling (level 1 or 2) after seeing waste in the forest. There are no obvious trends by gender. It is notable, though, that no person above 40 reported a feeling level of 2. This could suggest that older forest users are more appalled by seeing waste in the forest. However, the number of participants is too small to justify this claim.
ggplot(data = survey_small,
mapping = aes(x = age,
y = waste_feeling,
color = gender,
shape = gender
)
) +
geom_point(size = 3) +
labs(title = "Participants' mood level when seeing waste in the forest",
x = "age in years", y = "feeling when seeing forest waste")
The first statistics table (tbl-statistics1) lists the mean and median ages of the participants, grouped by gender. For women, the mean age (33.3) is close to the median (34). For men, the mean (26.4) lies above the median (22), which points to a few older participants among a mostly young group. The standard deviation is quite large in both groups, so the ages are widely spread: some participants were much older than others. The maximum age was 63, as shown above in Figure 3.
The table also shows how the participants rate the cleanliness of Zurich's forests, as mean and median. Since the standard deviation is low and the mean is close to the median, it can be concluded that most people rate the cleanliness between 3.5 and 4 and that there are no significant outliers. However, the rating scale (from 1 to 5) is narrow, which has to be kept in mind.
The greatest agreement is in the participants’ feeling when seeing waste in the forest: all participants report a bad feeling (1 or 2). The mean and the median are close to 1, and the standard deviation is low.
In conclusion: while participants consider Zurich's forests to be clean, any small amount of waste they see is perceived very negatively. According to the participants, even small amounts of waste should therefore be avoided.
survey_statistics <- survey_small |>
select(gender, age, waste_feeling, waste_rating, activities_frequency, sports_value, measures_frequency)
write_csv(survey_statistics, here::here("data/final/survey_statistics.csv"))
survey_table_gender1 <- survey_statistics |>
select(gender, age, waste_rating, waste_feeling) |>
group_by(gender) |> # grouped by gender
summarise(
count = n(),
age_mean = mean(age),
age_sd = sd(age),
age_median = median(age),
rating_mean = mean(waste_rating),
rating_sd = sd(waste_rating),
rating_median = median(waste_rating),
feeling_mean = mean(waste_feeling),
feeling_sd = sd(waste_feeling),
feeling_median = median(waste_feeling)
)
write_csv(survey_table_gender1, here::here("data/final/tbl-statistics1.csv"))
#|label: tbl-statistics1
#|tbl-cap: "Statistical data of the survey, part 1"
survey_table_gender1 |>
gt() |>
fmt_number(decimals = 1) |>
tab_header(title = "statistical table of participants' age, their forest-rating and their feeling when seeing forest waste",
subtitle = "data from 27 forest users in June 2025, grouped by gender")
statistical table of participants' age, their forest-rating and their feeling when seeing forest waste
data from 27 forest users in June 2025, grouped by gender
| gender | count | age_mean | age_sd | age_median | rating_mean | rating_sd | rating_median | feeling_mean | feeling_sd | feeling_median |
|---|---|---|---|---|---|---|---|---|---|---|
| female | 9.0 | 33.3 | 13.6 | 34.0 | 3.6 | 0.9 | 4.0 | 1.2 | 0.4 | 1.0 |
| male | 18.0 | 26.4 | 10.5 | 22.0 | 3.7 | 0.9 | 4.0 | 1.4 | 0.5 | 1.0 |
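The repeated `mean()`, `sd()` and `median()` calls in the summary above could be written more compactly with `dplyr::across()`. The following is only a sketch on made-up rows (the tibble `toy` is hypothetical, not the survey data). Note that the resulting column names follow the pattern `<column>_<function>`, so they would read `waste_rating_mean` instead of `rating_mean`.

```r
library(dplyr)

# Made-up example rows; in the report this would be survey_statistics.
toy <- tibble(
  gender = c("female", "female", "male", "male"),
  age = c(30, 40, 20, 24),
  waste_rating = c(3, 4, 4, 5)
)

# Apply the same three summary functions to every selected column.
toy_summary <- toy |>
  group_by(gender) |>
  summarise(
    count = n(),
    across(c(age, waste_rating),
           list(mean = mean, sd = sd, median = median))
  )

toy_summary
```

Adding a further variable to the table then only requires extending the column selection inside `across()`.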
The same conclusions can be drawn from the following graphic, which shows again how participants rate the cleanliness of Zurich's forests, charted by age and gender.
ggplot(data = survey_small,
mapping = aes(x = age,
y = waste_rating,
color = gender,
shape = gender)
) +
geom_point(size = 3) +
labs(x = "age in years",
y = "perceived forest cleanliness") +
ylim(1, NA)
The next table (tbl-statistics2) shows more statistical insights. The numbers for weekly forest visits and the value of forest sports are close for women and men. On average, men visit the forest slightly more often than women (mean 3.1 versus 2.8), although their median is lower (2.5 versus 3.0). The mean value assigned to sports activities in the forest is the same for both genders (4.1), while the median is higher for men (5 versus 4). In all cases, the difference between the two genders is small compared to the standard deviations.
A bigger difference can be seen for the suggested number of removal days per year. Women are generally in favor of more waste-removal days per year than men, as shown by the mean and median values. However, the male participants propose a wider range of removal days by authorities, with a minimum of 6 and a maximum of 180. Since twice as many men as women participated in the survey, this larger number of responses could itself be the cause of the wider numerical span, which therefore is not necessarily related to gender.
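The caveat about group size can be illustrated with a small simulation. This is a sketch under a strong assumption: both groups draw their answers from the same pool, and the values in `answers` are invented for illustration, not taken from the survey. Even then, the larger group tends to show the wider range.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Assumption: a made-up pool of plausible answers (removal days per year)
# that is identical for both groups.
answers <- c(6, 12, 12, 24, 24, 52, 52, 52, 100, 180)

# Average observed range (max - min) across many random groups of size n.
mean_range <- function(n, reps = 5000) {
  mean(replicate(reps, diff(range(sample(answers, n, replace = TRUE)))))
}

range_9  <- mean_range(9)   # group size of the female participants
range_18 <- mean_range(18)  # group size of the male participants

c(n9 = range_9, n18 = range_18)
```

With 18 draws, rare extreme answers are more likely to appear at least once than with 9 draws, so a wider span alone does not indicate a difference between the groups.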
#|label: tbl-statistics2
#|tbl-cap: "statistical data of the survey, part 2"
survey_table_gender2 <- survey_statistics |>
filter(!is.na(activities_frequency)) |> # remove NA values
select(gender, activities_frequency, sports_value, measures_frequency) |>
group_by(gender) |> # grouped by gender
summarise(
count = n(),
visits_mean = mean(activities_frequency),
visits_sd = sd(activities_frequency),
visits_median = median(activities_frequency),
sports_value_mean = mean(sports_value),
sports_value_sd = sd(sports_value),
sports_value_median = median(sports_value),
removal_days_mean = mean(measures_frequency),
removal_days_median = median(measures_frequency),
removal_days_min = min(measures_frequency),
removal_days_max = max(measures_frequency)
)
write_csv(survey_table_gender2, here::here("data/final/tbl-statistics2.csv"))
survey_table_gender2 |>
gt() |>
fmt_number(decimals = 1) |>
tab_header(title = "statistical table of participants' visiting frequency, their value of forest sports, and their suggested waste-removal days per year",
subtitle = "data from 27 forest users in June 2025, grouped by gender")
statistical table of participants' visiting frequency, their value of forest sports, and their suggested waste-removal days per year
data from 27 forest users in June 2025, grouped by gender
| gender | count | visits_mean | visits_sd | visits_median | sports_value_mean | sports_value_sd | sports_value_median | removal_days_mean | removal_days_median | removal_days_min | removal_days_max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| female | 9.0 | 2.8 | 1.2 | 3.0 | 4.1 | 0.9 | 4.0 | 46.9 | 52.0 | 12.0 | 100.0 |
| male | 18.0 | 3.1 | 2.1 | 2.5 | 4.1 | 1.3 | 5.0 | 37.6 | 24.0 | 6.0 | 180.0 |
The distribution of suggested removal days is also shown in the following diagram (Figure 6). It seems that most people desire weekly waste removal by authorities, indicated by the large number of data points around 52. But some are also satisfied with a monthly waste-removal schedule, which corresponds to 12 removal days per year.
ggplot(data = survey_small,
mapping = aes(x = age,
y = measures_frequency,
color = gender,
shape = gender
)
) +
geom_point(size = 3) +
labs(title = "number of days per year where authorities remove forest waste",
x = "age in years", y = "suggested days of waste-removal")
The next bar plot (Figure 7) shows the relation between the participants’ forest-visiting frequency and their rating of the forest’s cleanliness. Generally, people who visit the forest more often tend to give a higher rating.
ggplot(data = survey_small,
mapping = aes(x = waste_rating,
# activities_frequency is numeric; factor() gives one fill colour per value
fill = factor(activities_frequency)
)
) +
geom_bar(width = 0.2) +
labs(title = "Rating of Zurich's forest cleanliness",
subtitle = "answers from 24 forest users",
fill = "forest visits")
Still wanted: a plot that shows the average perceived cleanliness per day.
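The plot of average perceived cleanliness per day could be sketched as follows. This is one possible approach, not part of the original analysis: the rows in `toy` are made up, and the code assumes the column names `weekday` and `waste_rating` as they appear in `survey_small`.

```r
library(dplyr)
library(ggplot2)

# Made-up example rows; in the report this would be survey_small.
toy <- tibble(
  weekday = c("sunday", "sunday", "monday", "monday", "tuesday"),
  waste_rating = c(4, 3, 4, 5, 3)
)

# Mean rating and number of answers per day, in survey order.
rating_per_day <- toy |>
  mutate(weekday = factor(weekday,
                          levels = c("sunday", "monday", "tuesday",
                                     "wednesday", "thursday"))) |>
  group_by(weekday) |>
  summarise(rating_mean = mean(waste_rating), n = n())

p <- ggplot(rating_per_day, aes(x = weekday, y = rating_mean)) +
  geom_col() +
  # label each bar with the number of answers behind the mean
  geom_text(aes(label = paste0("n = ", n)), vjust = -0.5) +
  scale_y_continuous(limits = c(0, 5)) +
  labs(x = "weekday the survey was taken",
       y = "mean perceived forest cleanliness")

p
```

Showing the number of answers on each bar helps to avoid over-interpreting days with only one or two responses.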